1.
BMC Bioinformatics ; 25(1): 110, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38475691

ABSTRACT

BACKGROUND: The analysis of large and complex biological datasets in bioinformatics poses a significant challenge to achieving reproducible research outcomes due to inconsistencies and the lack of standardization in the analysis process. These issues can lead to discrepancies in results, undermining the credibility and impact of bioinformatics research and creating mistrust in the scientific process. To address these challenges, open science practices such as sharing data, code, and methods have been encouraged. RESULTS: CREDO, a Customizable, REproducible, DOcker file generator for bioinformatics applications, has been developed as a tool to mitigate reproducibility issues by building and distributing Docker containers with embedded bioinformatics tools. CREDO simplifies the process of generating Docker images, facilitating reproducibility and efficient research in bioinformatics. The crucial step in generating a Docker image is creating the Dockerfile, which requires incorporating heterogeneous packages and environments such as Bioconductor and Conda. CREDO stores all required package information and dependencies in a GitHub-compatible format to enhance Docker image reproducibility, allowing easy image creation from scratch. The user-friendly GUI and CREDO's ability to generate modular Docker images make it an ideal tool for life scientists to efficiently create Docker images. Overall, CREDO is a valuable tool for addressing reproducibility issues in bioinformatics research and promoting open science practices.
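
The Dockerfile-generation step described in this abstract can be illustrated with a minimal sketch. It is not CREDO's code or output format: the function name `render_dockerfile`, the base image name, and the package pins are hypothetical, and the sketch assumes a base image that already provides `apt-get`, `conda`, and R with BiocManager installed.

```python
def render_dockerfile(base_image, apt_packages, conda_packages, bioc_packages):
    """Return Dockerfile text for the given package lists.

    Illustrative layout only; assumes the base image already ships
    apt-get, conda, and R with the BiocManager package.
    """
    lines = ["FROM " + base_image]
    if apt_packages:
        # Sort so the same package set always yields the same file.
        pkgs = " ".join(sorted(apt_packages))
        lines.append(
            "RUN apt-get update && apt-get install -y --no-install-recommends "
            + pkgs + " && rm -rf /var/lib/apt/lists/*"
        )
    if conda_packages:
        pkgs = " ".join(sorted(conda_packages))
        lines.append("RUN conda install -y -c conda-forge -c bioconda " + pkgs)
    if bioc_packages:
        quoted = ", ".join("'" + p + "'" for p in sorted(bioc_packages))
        lines.append('RUN R -e "BiocManager::install(c(' + quoted + '))"')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile(
        "example/base:1.0",              # hypothetical image name
        ["curl", "build-essential"],
        ["samtools=1.17"],               # illustrative version pin
        ["DESeq2"],
    ))
```

Sorting the package lists makes the output independent of input order, so the generated file can be kept under version control and diffed, which is the reproducibility property the abstract emphasizes.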


Subject(s)
Computational Biology , Software , Reproducibility of Results , Computational Biology/methods
2.
Sci Data ; 11(1): 159, 2024 Feb 02.
Article in English | MEDLINE | ID: mdl-38307867

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has emerged as a vital tool in tumour research, enabling the exploration of molecular complexities at the individual cell level. It offers new technical possibilities for advancing tumour research with the potential to yield significant breakthroughs. However, deciphering meaningful insights from scRNA-seq data poses challenges, particularly in cell annotation and tumour subpopulation identification. Efficient algorithms are therefore needed to unravel the intricate biological processes of cancer. To address these challenges, benchmarking datasets are essential to validate bioinformatics methodologies for analysing single-cell omics in oncology. Here, we present a 10XGenomics scRNA-seq experiment, providing a controlled heterogeneous environment using lung cancer cell lines characterised by the expression of seven different driver genes (EGFR, ALK, MET, ERBB2, KRAS, BRAF, ROS1), leading to partially overlapping functional pathways. Our dataset provides a comprehensive framework for the development and validation of methodologies for analysing cancer heterogeneity by means of scRNA-seq.


Subject(s)
Benchmarking , Lung Neoplasms , Humans , Algorithms , Gene Expression Profiling/methods , Lung Neoplasms/genetics , Proto-Oncogene Proteins/genetics , Sequence Analysis, RNA/methods , Single-Cell Gene Expression Analysis , Cell Line, Tumor
3.
J Biomed Inform ; 148: 104546, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37984546

ABSTRACT

OBJECTIVE: Computational models are at the forefront of the pursuit of personalized medicine thanks to their descriptive and predictive abilities. In the presence of complex and heterogeneous data, patient stratification is a prerequisite for effective precision medicine, since disease development is often driven by individual variability and unpredictable environmental events. Herein, we present the GreatNector workflow as a valuable tool for (i) the analysis and clustering of patient-derived longitudinal data, and (ii) the simulation of the resulting model of patient-specific disease dynamics. METHODS: GreatNector is designed by combining an analytic strategy composed of CONNECTOR, a data-driven framework for the inspection of longitudinal data, and an unsupervised methodology to stratify the subjects with GreatMod, a quantitative modeling framework based on the Petri Net formalism and its generalizations. RESULTS: To illustrate GreatNector's capabilities, we exploited longitudinal data of four immune cell populations collected from Multiple Sclerosis patients. Our main results report that the T-cell dynamics after alemtuzumab treatment separate non-responder from responder patients, and that the patients in the non-responder group are characterized by an increase in the Th17 concentration around 36 months. CONCLUSION: GreatNector analysis was able to stratify individual patients into three model meta-patients whose dynamics suggested insight into patient-tailored interventions.


Subject(s)
Precision Medicine , Humans , Workflow , Computer Simulation , Precision Medicine/methods , Cluster Analysis
4.
Bioinformatics ; 39(5), 2023 05 04.
Article in English | MEDLINE | ID: mdl-37079732

ABSTRACT

MOTIVATION: The transition from evaluating a single time point to examining the entire dynamic evolution of a system is possible only in the presence of the proper framework. The strong variability of dynamic evolution makes the definition of an explanatory procedure for data fitting and clustering challenging. RESULTS: We developed CONNECTOR, a data-driven framework able to analyze and inspect longitudinal data in a straightforward and revealing way. When used to analyze tumor growth kinetics over time in 1599 patient-derived xenograft growth curves from ovarian and colorectal cancers, CONNECTOR allowed the aggregation of time-series data into informative clusters through an unsupervised approach. We give a new perspective on mechanism interpretation: specifically, we define novel model aggregations and identify unanticipated molecular associations with response to clinically approved therapies. AVAILABILITY AND IMPLEMENTATION: CONNECTOR is freely available under the GNU GPL license at https://qbioturin.github.io/connector and https://doi.org/10.17504/protocols.io.8epv56e74g1b/v1.


Subject(s)
Software , Humans , Animals , Cluster Analysis , Time Factors , Disease Models, Animal , Risk Assessment
5.
Methods Mol Biol ; 2584: 241-250, 2023.
Article in English | MEDLINE | ID: mdl-36495454

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) allows the creation of large collections of individual cell transcriptomes. Unsupervised clustering is an essential element for the analysis of these data, and it represents the initial step for the identification of different cell types to investigate the cell subpopulation organization of a sample. In this chapter, we describe how to approach the clustering of single-cell RNAseq transcriptomics data using various clustering tools, and we provide some information on the limitations affecting the clustering procedure.


Subject(s)
Single-Cell Analysis , Single-Cell Gene Expression Analysis , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Cluster Analysis , Gene Expression Profiling/methods , Algorithms
6.
Methods Mol Biol ; 2584: 337-345, 2023.
Article in English | MEDLINE | ID: mdl-36495459

ABSTRACT

The idea behind novel single-cell RNA sequencing (scRNA-seq) pipelines is to isolate single cells through microfluidic approaches and generate sequencing libraries in which the transcripts are tagged to track their cell of origin. Modern scRNA-seq platforms are capable of analyzing up to many thousands of cells in each run. Then, combined with massive high-throughput sequencing producing billions of reads, scRNA-seq allows the assessment of fundamental biological properties of cell populations and biological systems at unprecedented resolution. In this chapter, we describe how cell subpopulation discovery algorithms, integrated into rCASC, could be efficiently executed on cloud-HPC infrastructure. To achieve this task, we focus on the StreamFlow framework, which provides container-native runtime support for scientific workflows in cloud/HPC environments.


Subject(s)
Algorithms , Software , Workflow , High-Throughput Nucleotide Sequencing , Single-Cell Analysis , Sequence Analysis, RNA
7.
Int J Mol Sci ; 25(1), 2023 Dec 29.
Article in English | MEDLINE | ID: mdl-38203629

ABSTRACT

Among the several mechanisms accounting for endocrine resistance in breast cancer, autophagy has emerged as an important player. Previous reports have evidenced that tamoxifen (Tam) induces autophagy and activates transcription factor EB (TFEB), which regulates the expression of genes controlling autophagy and lysosomal biogenesis. However, the mechanisms by which this occurs have not been elucidated as yet. This investigation aims at dissecting how TFEB is activated and contributes to Tam resistance in luminal A breast cancer cells. TFEB was overexpressed and prominently nuclear in Tam-resistant MCF7 cells (MCF7-TamR) compared with their parental counterpart, and this was not dependent on alterations of its nucleo-cytoplasmic shuttling. Tam promoted the release of lysosomal Ca2+ through the major transient receptor potential cation channel mucolipin subfamily member 1 (TRPML1) and two-pore channels (TPCs), which caused the nuclear translocation and activation of TFEB. Consistently, inhibiting lysosomal calcium release restored the susceptibility of MCF7-TamR cells to Tam. Our findings demonstrate that Tam drives the nuclear relocation and transcriptional activation of TFEB by triggering the release of Ca2+ from the acidic compartment, and they suggest that lysosomal Ca2+ channels may represent new druggable targets to counteract the onset of autophagy-mediated endocrine resistance in luminal A breast cancer cells.


Subject(s)
Calcium , Neoplasms , Tamoxifen/pharmacology , Calcium, Dietary , Autophagy , Lysosomes
8.
Gigascience ; 11, 2022 08 10.
Article in English | MEDLINE | ID: mdl-35946989

ABSTRACT

BACKGROUND: Spatial transcriptomics (ST) combines stained tissue images with spatially resolved high-throughput RNA sequencing. The spatial transcriptomic analysis includes challenging tasks like clustering, where a partition among data points (spots) is defined by means of a similarity measure. Improving clustering results is a key factor as clustering affects subsequent downstream analysis. State-of-the-art approaches group data by taking into account transcriptional similarity, and some exploit spatial information as well. However, it is not yet clear how much the spatial information combined with transcriptomics improves the clustering result. RESULTS: We propose a new clustering method, Stardust, that easily exploits the combination of space and transcriptomic information in the clustering procedure through a manual or fully automatic tuning of algorithm parameters. Moreover, a parameter-free version of the method is also provided where the spatial contribution depends dynamically on the distribution of expression distances in the space. We evaluated the proposed method's results by analyzing ST data sets available on the 10x Genomics website and comparing clustering performance with state-of-the-art approaches by measuring the spots' stability in the clusters and their biological coherence. Stability is defined by the tendency of each point to remain clustered with the same neighbors when perturbations are applied. CONCLUSIONS: Stardust is an easy-to-use methodology that allows users to define how much spatial information should influence clustering on different tissues and that achieves more stable results than state-of-the-art approaches.
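
The idea of tuning how much spatial information influences clustering can be sketched as a weighted sum of two distance matrices. This is an assumption made for illustration, not necessarily the formula Stardust uses: the function names, the max-scaling step, and the `space_weight` parameter are all hypothetical.

```python
import math


def pairwise(points):
    """Euclidean distance matrix between all pairs of points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def scale_to_unit(matrix):
    """Divide by the largest entry so both distance types share a 0..1 range."""
    top = max(max(row) for row in matrix)
    if top == 0:
        return [row[:] for row in matrix]
    return [[value / top for value in row] for row in matrix]


def combined_distance(expression, coords, space_weight):
    """Expression distance plus space_weight times spatial distance.

    space_weight = 0 ignores spot positions entirely; larger values let
    physical proximity dominate (assumed weighting scheme).
    """
    d_expr = scale_to_unit(pairwise(expression))
    d_space = scale_to_unit(pairwise(coords))
    return [
        [e + space_weight * s for e, s in zip(row_e, row_s)]
        for row_e, row_s in zip(d_expr, d_space)
    ]


if __name__ == "__main__":
    expression = [[1.0, 0.0], [1.0, 0.1], [0.0, 5.0]]   # toy profiles per spot
    coords = [[0.0, 0.0], [10.0, 0.0], [0.0, 1.0]]      # toy spot positions
    for w in (0.0, 2.0):
        print(w, combined_distance(expression, coords, w)[0])
```

The resulting matrix could be passed to any distance-based clustering routine. In the toy data, spot 0 is nearest to spot 1 when only expression counts, and nearest to spot 2 once the spatial term is weighted heavily.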


Subject(s)
Data Analysis , Transcriptome , Algorithms , Cluster Analysis
9.
Int J Mol Sci ; 22(23), 2021 Nov 25.
Article in English | MEDLINE | ID: mdl-34884559

ABSTRACT

BACKGROUND: Biological processes are based on complex networks of cells and molecules. Single cell multi-omics is a new tool aiming to provide new insights into the complex network of events controlling the functionality of the cell. METHODS: Since single cell technologies provide many sample measurements, they are the ideal environment for the application of Deep Learning and Machine Learning approaches. An autoencoder is composed of an encoder and a decoder sub-model. An autoencoder is a very powerful tool in data compression and noise removal. However, the decoder model remains a black box from which it is impossible to depict the contribution of the single input elements. We have recently developed a new class of autoencoders, called Sparsely Connected Autoencoders (SCA), which have the advantage of providing a controlled association among the input layer and the decoder module. This new architecture has the benefit that the decoder model is not a black box anymore and can be used to depict new biologically interesting features from single cell data. RESULTS: Here, we show that the SCA hidden layer can capture new information usually hidden in single cell data, such as clustering on meta-features that are difficult (e.g. transcription factor expression) or technically not possible (e.g. miRNA expression) to depict in single cell RNAseq data. Furthermore, the SCA representation of cell clusters has the advantage of simulating a conventional bulk RNAseq, which is a data transformation allowing the identification of similarity among independent experiments. CONCLUSIONS: In our opinion, SCA represents the bioinformatics version of a universal "Swiss-knife" for the extraction of hidden knowledgeable features from single cell omics data.
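
A sparsely connected layer can be pictured as a dense layer whose weight matrix is multiplied element-wise by a fixed 0/1 mask, so that each hidden node only receives input from a known set of genes. The following sketch shows that masking idea under stated assumptions: the gene and gene-set names are hypothetical, the ReLU activation is a choice made here, and no training loop is included.

```python
def build_mask(genes, gene_sets):
    """mask[i][j] is 1.0 when gene i belongs to gene set j, else 0.0."""
    set_names = sorted(gene_sets)
    mask = [
        [1.0 if gene in gene_sets[name] else 0.0 for name in set_names]
        for gene in genes
    ]
    return set_names, mask


def sparse_hidden(expression, weights, mask):
    """Hidden activations of a masked dense layer with ReLU.

    A weight contributes only where the mask is 1, so hidden node j
    can be read as a summary of gene set j.
    """
    hidden = []
    for j in range(len(mask[0])):
        total = sum(
            x * w_row[j] * m_row[j]
            for x, w_row, m_row in zip(expression, weights, mask)
        )
        hidden.append(max(0.0, total))
    return hidden


if __name__ == "__main__":
    genes = ["g1", "g2", "g3"]                               # hypothetical genes
    gene_sets = {"TF_A": {"g1", "g2"}, "TF_B": {"g3"}}       # hypothetical targets
    names, mask = build_mask(genes, gene_sets)
    weights = [[1.0, 1.0] for _ in genes]
    print(names, sparse_hidden([1.0, 2.0, 3.0], weights, mask))
```

Because the connections are fixed by prior knowledge, a change in a gene outside a given set cannot alter that set's hidden node, which is what makes the hidden layer interpretable.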


Subject(s)
Adenocarcinoma of Lung/pathology , Cluster Analysis , Computational Biology/methods , Lung Neoplasms/pathology , Machine Learning , Neural Networks, Computer , Single-Cell Analysis/methods , Adenocarcinoma of Lung/genetics , Humans , Lung Neoplasms/genetics , Exome Sequencing
10.
Br J Haematol ; 194(2): 378-381, 2021 07.
Article in English | MEDLINE | ID: mdl-34002365

ABSTRACT

Minimal residual disease (MRD) determined by classic polymerase chain reaction (PCR) methods is a powerful outcome predictor in mantle cell lymphoma (MCL). Nevertheless, some technical pitfalls can reduce the detection rate of molecular markers. Therefore, we applied the EuroClonality-NGS IGH (next-generation sequencing immunoglobulin heavy chain) method (previously published in acute lymphoblastic leukaemia) to 20 MCL patients enrolled in an Italian phase III trial sponsored by Fondazione Italiana Linfomi. Results from this preliminary investigation show that the EuroClonality-NGS IGH method is feasible in the MCL context, detecting a molecular IGH target in 19/20 investigated cases and allowing MRD monitoring also in those patients lacking a molecular marker for classical screening approaches.


Subject(s)
Gene Rearrangement , High-Throughput Nucleotide Sequencing , Immunoglobulin Heavy Chains/genetics , Lymphoma, Mantle-Cell/genetics , Biomarkers, Tumor/genetics , Genes, Immunoglobulin , High-Throughput Nucleotide Sequencing/methods , Humans , Italy/epidemiology , Lymphoma, Mantle-Cell/diagnosis , Lymphoma, Mantle-Cell/epidemiology , Neoplasm, Residual/diagnosis , Neoplasm, Residual/epidemiology , Neoplasm, Residual/genetics
11.
Int J Mol Sci ; 22(8), 2021 Apr 19.
Article in English | MEDLINE | ID: mdl-33921709

ABSTRACT

BACKGROUND: Disruption of alternative splicing (AS) is frequently observed in cancer and might represent an important signature for tumor progression and therapy. Exon skipping (ES) represents one of the most frequent AS events, and in non-small cell lung cancer (NSCLC) MET exon 14 skipping was shown to be targetable. METHODS: We constructed neural networks (NN/CNN) specifically designed to detect MET exon 14 skipping events using RNAseq data. Furthermore, for discovery purposes we also developed a sparsely connected autoencoder to identify uncharacterized MET isoforms. RESULTS: The neural networks had a MET exon 14 skipping detection rate greater than 94% when tested on a manually curated set of 690 TCGA bronchus and lung samples. When globally applied to 2605 TCGA samples, we observed that the majority of false positives were characterized by a blurry coverage of exon 14. Interestingly, they share a common coverage peak in the second intron, and we speculate that this event could be the transcription signature of a LINE1 (Long Interspersed Nuclear Element 1)-MET (Mesenchymal Epithelial Transition receptor tyrosine kinase) fusion. CONCLUSIONS: Taken together, our results indicate that neural networks can be an effective tool to provide a quick classification of pathological transcription events, and sparsely connected autoencoders could represent the basis for the development of an effective discovery tool.


Subject(s)
Deep Learning , Exons/genetics , Genetic Variation/genetics , Humans , Neural Networks, Computer
12.
Methods Mol Biol ; 2284: 181-192, 2021.
Article in English | MEDLINE | ID: mdl-33835443

ABSTRACT

Analysis of circular RNA (circRNA) expression from RNA-Seq data can be performed with different algorithms and analysis pipelines, tools allowing the extraction of heterogeneous information on the expression of this novel class of RNAs. Computational pipelines were developed to facilitate the analysis of circRNA expression by leveraging different public tools in easy-to-use pipelines. This chapter describes the complete workflow for a computationally reproducible analysis of circRNA expression starting from a public RNA-Seq experiment. The main steps of circRNA prediction, annotation, classification, sequence reconstruction, quantification, and differential expression are illustrated.


Subject(s)
Computational Biology/methods , RNA, Circular/analysis , RNA-Seq/methods , Algorithms , Datasets as Topic/statistics & numerical data , Humans , RNA, Circular/chemistry , RNA, Circular/genetics , RNA, Untranslated/analysis , RNA, Untranslated/chemistry , RNA, Untranslated/genetics , RNA-Seq/statistics & numerical data , Sequence Analysis, RNA , Software , Transcriptome
13.
Methods Mol Biol ; 2284: 289-301, 2021.
Article in English | MEDLINE | ID: mdl-33835449

ABSTRACT

Single-cell RNAseq data can be generated using various technologies, spanning from isolation of cells by FACS or droplet sequencing, to the use of frozen tissue sections retaining spatial information of cells in their morphological context. The analysis of single cell RNAseq data is mainly focused on the identification of cell subpopulations characterized by specific gene markers that can be used to purify the population of interest for further biological studies. This chapter describes the steps required for dataset clustering and marker detection using a droplet dataset and a spatial transcriptomics dataset.


Subject(s)
Computational Biology/methods , RNA-Seq/methods , Single-Cell Analysis/methods , Cluster Analysis , Datasets as Topic , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Sequence Analysis, RNA/methods , Exome Sequencing/methods
14.
BMC Bioinformatics ; 22(1): 209, 2021 Apr 22.
Article in English | MEDLINE | ID: mdl-33888059

ABSTRACT

BACKGROUND: Graphs are mathematical structures widely used for expressing relationships among elements when representing biomedical and biological information. On top of these representations, several analyses are performed. A common task is the search for one substructure within one graph, called the target. The problem is referred to as one-to-one subgraph search, and it is known to be NP-complete. Heuristics and indexing techniques can be applied to facilitate the search. Indexing techniques are also exploited in the context of searching in a collection of target graphs, referred to as the one-to-many subgraph problem. Filter-and-verification methods that use indexing approaches provide a fast pruning of target graphs, or parts of them, that do not contain the query. The expensive verification phase is then performed only on the subset of promising targets. Indexing strategies extract graph features at a sufficient granularity level for performing a powerful filtering step. Features are stored in data structures allowing efficient access. Indexing size, querying time and filtering power are key points for the development of efficient subgraph searching solutions. RESULTS: An existing approach, GRAPES, has been shown to have good performance in terms of speed-up for both one-to-one and one-to-many cases. However, it suffers from the size of the built index. For this reason, we propose GRAPES-DD, a modified version of GRAPES in which the indexing structure has been replaced with a Decision Diagram. Decision Diagrams are a broad class of data structures widely used to encode and manipulate functions efficiently. Experiments on biomedical structures and synthetic graphs have confirmed our expectation, showing that GRAPES-DD substantially reduces memory utilization compared to GRAPES without worsening the searching time. CONCLUSION: The use of Decision Diagrams for searching in biochemical and biological graphs is completely new and potentially promising thanks to their ability to compactly encode sets by exploiting their structure and regularity, and to manipulate entire sets of elements at once, instead of exploring each single element explicitly. Search strategies based on Decision Diagrams make indexing more affordable for biochemical graphs and beyond, allowing us to potentially deal with huge and ever-growing collections of biochemical and biological structures.
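
The filtering step of a filter-and-verification search can be sketched with label paths as features. This is a simplified illustration, not the GRAPES or GRAPES-DD implementation: the index here is a plain `Counter` rather than a trie or Decision Diagram, the function names are hypothetical, and the expensive verification phase (an exact subgraph isomorphism test) is left out.

```python
from collections import Counter


def path_features(labels, adjacency, max_nodes):
    """Count label sequences along simple paths of up to max_nodes nodes.

    labels maps node -> label; adjacency maps node -> iterable of neighbours.
    """
    counts = Counter()

    def extend(path):
        counts[tuple(labels[v] for v in path)] += 1
        if len(path) == max_nodes:
            return
        for nxt in adjacency[path[-1]]:
            if nxt not in path:          # keep paths simple (no repeated node)
                extend(path + [nxt])

    for start in labels:
        extend([start])
    return counts


def may_contain(target_features, query_features):
    """Filter: a target lacking any query feature cannot contain the query.

    True means 'still a candidate'; it must be confirmed by verification.
    """
    return all(target_features[f] >= c for f, c in query_features.items())


if __name__ == "__main__":
    query = ({1: "A", 2: "B"}, {1: [2], 2: [1]})
    chain_abc = ({1: "A", 2: "B", 3: "C"}, {1: [2], 2: [1, 3], 3: [2]})
    chain_ac = ({1: "A", 2: "C"}, {1: [2], 2: [1]})
    q = path_features(*query, max_nodes=3)
    for target in (chain_abc, chain_ac):
        print(may_contain(path_features(*target, max_nodes=3), q))
```

The filter never discards a true match, because every simple path in the query maps to a distinct path with the same labels in any target that contains it. It can still admit false candidates, which is why verification follows.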


Subject(s)
Vitis , Abstracting and Indexing , Algorithms , Databases, Factual
15.
NPJ Syst Biol Appl ; 7(1): 1, 2021 01 05.
Article in English | MEDLINE | ID: mdl-33402683

ABSTRACT

Single-cell RNA sequencing (scRNAseq) is an essential tool to investigate cellular heterogeneity. Thus, it would be of great interest to be able to disclose biological information belonging to cell subpopulations, which can be defined by clustering analysis of scRNAseq data. In this manuscript, we report a tool that we developed for the functional mining of single cell clusters based on Sparsely-Connected Autoencoder (SCA). This tool allows uncovering hidden features associated with scRNAseq data. We implemented two new metrics, QCC (Quality Control of Cluster) and QCM (Quality Control of Model), which allow quantifying the ability of SCA to reconstruct valuable cell clusters and to evaluate the quality of the neural network achievements, respectively. Our data indicate that the SCA encoded space, derived from different experimentally validated data (TF targets, miRNA targets, Kinase targets, and cancer-related immune signatures), can be used to grasp single cell cluster-specific functional features. In our implementation, SCA efficacy comes from its ability to reconstruct only specific clusters, thus indicating only those clusters where the SCA encoding space is a key element for cell aggregation. SCA analysis is implemented as a module in the rCASC framework and is supported by a GUI to simplify its usage for biologists and medical personnel.


Subject(s)
Data Mining/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Algorithms , Base Sequence/genetics , Cluster Analysis , Humans , Neural Networks, Computer , Software , Systems Biology/methods , Exome Sequencing/methods
16.
BMC Bioinformatics ; 21(Suppl 17): 550, 2020 Dec 14.
Article in English | MEDLINE | ID: mdl-33308135

ABSTRACT

BACKGROUND: Multiple Sclerosis (MS) is currently the leading cause of non-traumatic disability in young adults in Europe, with more than 700,000 EU cases. Although huge strides have been made over the years, MS etiology remains partially unknown. Furthermore, the presence of various endogenous and exogenous factors can greatly influence the immune response of different individuals, making it difficult to study and understand the disease. This becomes more evident in a personalized setting, when medical doctors have to choose the best therapy for patient well-being. From this perspective, the use of stochastic models, capable of taking into consideration all the fluctuations due to unknown factors and individual variability, is highly advisable. RESULTS: We propose a new model to study the immune response in relapsing remitting MS (RRMS), the most common form of MS, which is characterized by alternating episodes of symptom exacerbation (relapses) and periods of disease stability (remission). In this new model, both the peripheral lymph node/blood vessel and the central nervous system are explicitly represented. The model was created and analysed using Epimod, our recently developed general framework for modeling complex biological systems. Then the effectiveness of our model was shown by modeling the complex immunological mechanisms characterizing RRMS during its course and under DAC administration. CONCLUSIONS: Simulation results have proven the ability of the model to reproduce in silico the immune T cell balance characterizing the RRMS course and the DAC effects. Furthermore, they confirmed the importance of a timely intervention on the disease course.


Subject(s)
Immune System/physiology , Models, Biological , Multiple Sclerosis, Relapsing-Remitting/immunology , User-Computer Interface , Algorithms , Daclizumab/therapeutic use , Humans , Immunosuppressive Agents/therapeutic use , Multiple Sclerosis, Relapsing-Remitting/drug therapy , Multiple Sclerosis, Relapsing-Remitting/pathology , Stochastic Processes
17.
BMC Infect Dis ; 20(1): 798, 2020 Oct 28.
Article in English | MEDLINE | ID: mdl-33115434

ABSTRACT

BACKGROUND: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the causative agent of the coronavirus disease 19 (COVID-19), is a highly transmissible virus. Since the first person-to-person transmission of SARS-CoV-2 was reported in Italy on February 21st, 2020, the number of people infected with SARS-CoV-2 increased rapidly, mainly in northern Italian regions, including Piedmont. A strict lockdown was imposed on March 21st until May 4th, when a gradual relaxation of the restrictions started. In this context, computational models and computer simulations are among the available research tools that epidemiologists can exploit to understand the spread of the disease and to evaluate social measures to counteract, mitigate or delay the spread of the epidemic. METHODS: This study presents an extended version of the Susceptible-Exposed-Infected-Removed-Susceptible (SEIRS) model accounting for population age structure. The infectious population is divided into three sub-groups: (i) undetected infected individuals, (ii) quarantined infected individuals and (iii) hospitalized infected individuals. Moreover, the strength of the government restriction measures and the related population response to these are explicitly represented in the model. RESULTS: The proposed model allows us to investigate different scenarios of the COVID-19 spread in Piedmont and the implementation of different infection-control measures and testing approaches. The results show that the implemented control measures have proven effective in containing the epidemic, mitigating the potentially dangerous impact of a large proportion of undetected cases. We also forecast the optimal combination of individual-level measures and community surveillance to contain the new wave of COVID-19 spread after the re-opening of work and social activities. CONCLUSIONS: Our model is an effective tool for investigating different scenarios and informing policy makers about the potential impact of different control strategies. This will be crucial in the upcoming months, when very critical decisions about easing control measures will need to be taken.
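
For readers unfamiliar with the SEIRS formalism, the basic model that the study extends can be sketched as four coupled rate equations integrated with a fixed-step Euler scheme. The sketch below covers only the textbook SEIRS model. It omits the age structure, the three infectious sub-groups, and the restriction-strength terms described in the abstract, and all parameter values are illustrative assumptions rather than fitted estimates.

```python
def seirs_step(state, beta, sigma, gamma, omega, dt):
    """Advance (S, E, I, R) by one Euler step of length dt.

    beta: transmission rate, sigma: 1 / latent period,
    gamma: 1 / infectious period, omega: 1 / duration of immunity.
    """
    s, e, i, r = state
    n = s + e + i + r
    new_exposed = beta * s * i / n       # S -> E flow
    ds = -new_exposed + omega * r        # immunity wanes: R -> S
    de = new_exposed - sigma * e         # E -> I
    di = sigma * e - gamma * i           # I -> R
    dr = gamma * i - omega * r
    return (s + dt * ds, e + dt * de, i + dt * di, r + dt * dr)


def simulate(state, steps, **params):
    """Return the list of states visited, including the initial one."""
    trajectory = [state]
    for _ in range(steps):
        state = seirs_step(state, **params)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    start = (999_990.0, 0.0, 10.0, 0.0)          # illustrative population
    run = simulate(start, steps=1000, beta=0.4, sigma=0.2,
                   gamma=0.1, omega=0.01, dt=0.1)
    print("final state:", tuple(round(x) for x in run[-1]))
```

Because every outflow from one compartment is an inflow to another, the total population stays constant, which gives a quick sanity check on any implementation.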


Subject(s)
Communicable Disease Control/methods , Coronavirus Infections/epidemiology , Coronavirus Infections/prevention & control , Pandemics/prevention & control , Pneumonia, Viral/epidemiology , Pneumonia, Viral/prevention & control , Betacoronavirus/isolation & purification , COVID-19 , Carrier State/diagnosis , Carrier State/epidemiology , Coronavirus Infections/diagnosis , Coronavirus Infections/transmission , Disease Susceptibility/diagnosis , Disease Susceptibility/epidemiology , Humans , Italy/epidemiology , Models, Theoretical , Pneumonia, Viral/diagnosis , Pneumonia, Viral/transmission , Quarantine , SARS-CoV-2
18.
BMC Bioinformatics ; 21(Suppl 8): 344, 2020 Sep 16.
Article in English | MEDLINE | ID: mdl-32938370

ABSTRACT

BACKGROUND: Emerging and re-emerging infectious diseases such as Zika, SARS, COVID-19 and pertussis pose a compelling challenge for epidemiologists due to their significant impact on global public health. In this context, computational models and computer simulations are among the available research tools that epidemiologists can exploit to better understand the spreading characteristics of these diseases and to decide on vaccination policies, human interaction controls, and other social measures to counter, mitigate or simply delay the spread of the infectious diseases. Nevertheless, the construction of mathematical models for these diseases and their solution remain challenging tasks, because little effort has been devoted to the definition of a general framework easily accessible even to researchers without advanced modelling and mathematical skills. RESULTS: In this paper we describe a new general modeling framework to study epidemiological systems, whose novelties and strengths are: (1) the use of a graphical formalism to simplify the model creation phase; (2) the implementation of an R package providing a friendly interface to access the analysis techniques implemented in the framework; (3) a high level of portability and reproducibility granted by the containerization of all analysis techniques implemented in the framework; (4) a well-defined schema and related infrastructure to allow users to easily integrate their own analysis workflow in the framework. Then, the effectiveness of this framework is shown through a case study in which we investigate pertussis epidemiology in Italy. CONCLUSIONS: We propose a new general modeling framework for the analysis of epidemiological systems, which exploits the Petri Net graphical formalism, the R environment, and Docker containerization to derive a tool easily accessible to any researcher, even without advanced mathematical and computational skills. Moreover, the framework was implemented following the guidelines defined by the Reproducible Bioinformatics Project, so it guarantees reproducible analysis and simplifies the development of new user-defined workflows.


Subject(s)
Computational Biology/methods , Computer Simulation/standards , Vaccination/methods , Whooping Cough/epidemiology , Adolescent , Child , Humans , Reproducibility of Results
19.
BMC Bioinformatics ; 20(Suppl 6): 623, 2019 Dec 10.
Article in English | MEDLINE | ID: mdl-31822261

ABSTRACT

BACKGROUND: Multiple Sclerosis (MS) is an immune-mediated inflammatory disease of the Central Nervous System (CNS) which damages the myelin sheath enveloping nerve cells, thus causing severe physical disability in patients. Relapsing Remitting Multiple Sclerosis (RRMS) is one of the most common forms of MS in adults and is characterized by a series of neurologic symptoms, followed by periods of remission. Recently, many treatments were proposed and studied to counter RRMS progression. Among these drugs, daclizumab (commercial name Zinbryta), an antibody tailored against the Interleukin-2 receptor of T cells, exhibited promising results, but its efficacy was accompanied by an increased frequency of serious adverse events. Manifested side effects consisted of infections, encephalitis, and liver damage. Therefore daclizumab has been withdrawn from the market worldwide. Another interesting case of RRMS regards its progression in pregnant women, where a smaller incidence of relapses until delivery has been observed. RESULTS: In this paper we propose a new methodology for studying RRMS, which we implemented in GreatSPN, a state-of-the-art open-source suite for modelling and analyzing complex systems through the Petri Net (PN) formalism. This methodology exploits: (a) an extended Colored PN formalism to provide a compact graphical description of the system and to automatically derive a set of ODEs encoding the system dynamics and (b) the Latin Hypercube Sampling with PRCC index to calibrate ODE parameters for reproducing the real behaviours in healthy and MS subjects. To show the effectiveness of such methodology, a model of RRMS has been constructed and studied. Two different scenarios of RRMS were thus considered. In the former scenario the effect of daclizumab administration is investigated, while in the latter RRMS was studied in pregnant women. CONCLUSIONS: We propose a new computational methodology to study RRMS disease. Moreover, we show that a model generated and calibrated according to this methodology is able to reproduce the expected behaviours.
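
Latin Hypercube Sampling, mentioned above as part of the parameter calibration, splits each parameter range into as many equal strata as there are samples and draws exactly one value per stratum. The following is a minimal standard-library sketch of that sampling step only: the function name and parameter bounds are hypothetical, and the PRCC ranking that the abstract pairs with it is not implemented here.

```python
import random


def latin_hypercube(bounds, n_samples, seed=None):
    """Draw n_samples points with one value per stratum in every dimension.

    bounds is a list of (low, high) pairs, one per parameter.
    """
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each of the n_samples equal-width strata.
        values = [
            low + (high - low) * (k + rng.random()) / n_samples
            for k in range(n_samples)
        ]
        # Shuffle so strata are paired at random across dimensions.
        rng.shuffle(values)
        columns.append(values)
    return [tuple(column[i] for column in columns) for i in range(n_samples)]


if __name__ == "__main__":
    # Hypothetical ranges for two ODE rate parameters.
    for point in latin_hypercube([(0.0, 1.0), (0.01, 0.5)], n_samples=5, seed=1):
        print(point)
```

Compared with plain random sampling, this guarantees that every part of each parameter's range is visited even with few samples, which matters when each sample requires a full ODE simulation.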


Subject(s)
Computer Simulation , Multiple Sclerosis, Relapsing-Remitting , Computational Biology , Disease Progression , Female , Humans , Immunosuppressive Agents/therapeutic use , Multiple Sclerosis, Relapsing-Remitting/immunology , Multiple Sclerosis, Relapsing-Remitting/physiopathology , Pregnancy , Recurrence
20.
Gigascience ; 8(9), 2019 09 01.
Article in English | MEDLINE | ID: mdl-31494672

ABSTRACT

BACKGROUND: Single-cell RNA sequencing is essential for investigating cellular heterogeneity and highlighting cell subpopulation-specific signatures. Single-cell sequencing applications have spread from conventional RNA sequencing to epigenomics, e.g., ATAC-seq. Many related algorithms and tools have been developed, but few computational workflows provide analysis flexibility while also achieving functional (i.e., information about the data and the tools used is saved as metadata) and computational reproducibility (i.e., a real image of the computational environment used to generate the data is stored) through a user-friendly environment. FINDINGS: rCASC is a modular workflow providing an integrated analysis environment (from count generation to cell subpopulation identification) exploiting Docker containerization to achieve both functional and computational reproducibility in data analysis. Hence, rCASC provides preprocessing tools to remove low-quality cells and/or specific bias, e.g., cell cycle. Subpopulation discovery can instead be achieved using different clustering techniques based on different distance metrics. Cluster quality is then estimated through the new metric "cell stability score" (CSS), which describes the stability of a cell in a cluster as a consequence of a perturbation induced by removing a random set of cells from the cell population. CSS provides better cluster robustness information than the silhouette metric. Moreover, rCASC's tools can identify cluster-specific gene signatures. CONCLUSIONS: rCASC is a modular workflow with new features that could help researchers define cell subpopulations and detect subpopulation-specific markers. It uses Docker for ease of installation and to achieve a computationally reproducible analysis. A Java GUI is provided to welcome users without computational skills in R.
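
The perturbation idea behind a cell stability score can be sketched as follows: remove a random subset of cells, recluster, and record how often each remaining cell keeps essentially the same cluster companions. This is a loose illustration under stated assumptions, not rCASC's definition of CSS. The Jaccard overlap, the `min_overlap` threshold, the default parameters, and the function name are all choices made here, and the clustering routine is supplied by the caller.

```python
import random


def stability_scores(cells, cluster_fn, n_runs=20, drop_fraction=0.1,
                     min_overlap=0.8, seed=0):
    """Fraction of perturbed runs in which each cell keeps its companions.

    cluster_fn takes a list of cells and returns one label per cell.
    """
    rng = random.Random(seed)
    ids = list(range(len(cells)))
    reference = cluster_fn(cells)
    hits = [0] * len(cells)
    seen = [0] * len(cells)
    n_drop = max(1, int(drop_fraction * len(ids)))
    for _ in range(n_runs):
        kept = sorted(rng.sample(ids, len(ids) - n_drop))
        labels = cluster_fn([cells[i] for i in kept])
        for pos, i in enumerate(kept):
            # Companions before and after the perturbation, among kept cells.
            before = {j for j in kept if reference[j] == reference[i]}
            after = {kept[p] for p, lab in enumerate(labels) if lab == labels[pos]}
            overlap = len(before & after) / len(before | after)
            seen[i] += 1
            hits[i] += overlap >= min_overlap
    return [h / s if s else None for h, s in zip(hits, seen)]


if __name__ == "__main__":
    toy_cells = [0.1, 0.2, 0.3, 5.1, 5.2, 5.3]            # one value per cell
    threshold_clusters = lambda values: [int(v > 2.5) for v in values]
    print(stability_scores(toy_cells, threshold_clusters))
```

A score near 1 indicates a cell whose cluster membership survives the removal of other cells, while low scores flag cells that sit between clusters or in clusters that exist only because of a few neighbours.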


Subject(s)
Sequence Analysis, RNA , Single-Cell Analysis , Workflow , Cluster Analysis , Humans , Leukocytes, Mononuclear/metabolism , Software